communication message
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- (2 more...)
Multi-Agent Reinforcement Learning with Communication-Constrained Priors
Yang, Guang, Yang, Tianpei, Qiao, Jingwen, Wu, Yanqing, Huo, Jing, Chen, Xingguo, Gao, Yang
Communication is an effective means of improving the learning of cooperative policies in multi-agent systems. In most real-world scenarios, however, lossy communication is a prevalent issue. Existing multi-agent reinforcement learning methods with communication struggle to apply to complex, dynamic real-world environments because of their limited scalability and robustness. To address these challenges, we propose a generalized communication-constrained model that uniformly characterizes communication conditions across different scenarios, and use it as a learning prior to distinguish between lossy and lossless messages in a specific scenario. Additionally, we decouple the impact of lossy and lossless messages on distributed decision-making by drawing on a dual mutual information estimator, and introduce a communication-constrained multi-agent reinforcement learning framework that quantifies the impact of communication messages in the global reward. Finally, we validate the effectiveness of our approach on several communication-constrained benchmarks.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
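The abstract's "communication-constrained model" can be pictured as a channel prior under which each message either arrives intact (lossless) or degraded (lossy). A toy sketch of such a prior, assuming a simple drop-or-noise channel; the drop probability, noise model, and all names are illustrative, not the paper's actual model:

```python
import random

def constrained_channel(messages, drop_prob=0.3, noise_std=0.1, rng=None):
    """Toy communication-constrained prior: each message is delivered
    losslessly with probability 1 - drop_prob, otherwise it is degraded
    (its content replaced by noise). Returns the received messages plus
    a mask marking which ones arrived lossless."""
    rng = rng or random.Random(0)
    received, lossless = [], []
    for m in messages:
        if rng.random() < drop_prob:
            # Lossy branch: the original content is lost, only noise arrives.
            received.append([rng.gauss(0.0, noise_std) for _ in m])
            lossless.append(False)
        else:
            received.append(list(m))
            lossless.append(True)
    return received, lossless
```

A downstream learner could use the returned mask as the prior the paper describes, treating lossless and lossy entries differently when estimating each message's contribution to the global reward.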
a03caec56cd82478bf197475b48c05f9-Supplemental.pdf
Algorithm 1 shows the pseudocode of LIAM.

Algorithm 1: Pseudocode of LIAM
for m = 1,...,M episodes do
    Reset the hidden state of the encoder LSTM
    Sample E fixed policies from Π
    Create E parallel environments and gather initial observations

The fixed policies in the predator-prey task consist of a combination of heuristic and pretrained policies. First we created four heuristic policies: (i) going after the prey; (ii) going after one of the predators; (iii) going after the agent (predator or prey) that is closest; (iv) going after the predator that is closest. CARL has access to the trajectories of all the other agents in the environment during training, but only to the local trajectory during execution. To extract such representations, we use self-supervised learning based on recent advances in contrastive learning [Oord et al., 2018; He et al., 2020; Chen et al., 2020a,b]. During training, given a batch of episode trajectories, we construct the positive and negative pairs following Equation (4) and minimise the InfoNCE loss [Oord et al., 2018]. Following the work of Chung et al. [2015] we can write the lower bound on the log-evidence of the [...]. We train LIAM-VAE similarly to LIAM.
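The InfoNCE loss referenced above has a compact form. A minimal plain-Python sketch for a single anchor with one positive and K negatives; the dot-product similarity and temperature value are standard choices assumed here, not taken from the supplemental:

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: cross-entropy of picking the positive
    among {positive} + negatives, with dot-product similarity scaled by a
    temperature. Lower loss means the anchor is closer to its positive
    than to the negatives."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

With a positive identical to the anchor and an orthogonal negative the loss is near zero; swapping the roles makes it large, which is exactly the pressure that shapes the learned representations.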
Learning to Communicate in Multi-Agent Reinforcement Learning for Autonomous Cyber Defence
Contractor, Faizan, Li, Li, Mallah, Ranwa Al
Popular methods in cooperative Multi-Agent Reinforcement Learning for partially observable environments typically allow agents to act independently during execution, which may limit the coordinated effect of the trained policies. By sharing information such as known or suspected ongoing threats, however, effective communication can lead to improved decision-making in the cyber battle space. We propose a game design in which defender agents learn to communicate and defend against imminent cyber threats by playing training games in the Cyber Operations Research Gym, using the Differentiable Inter-Agent Learning algorithm adapted to the cyber operational environment. The tactical policies learned by these autonomous agents are akin to those of human experts during incident responses to avert cyber threats. In addition, the agents learn minimal-cost communication messages while simultaneously learning their defence tactical policies.
- North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
- North America > Canada > Ontario > Kingston (0.14)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.53)
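The core mechanism of Differentiable Inter-Agent Learning is its discretise/regularise unit (DRU): during centralised training a real-valued message is regularised with noise and squashed through a sigmoid so gradients can flow between agents, while during decentralised execution it is discretised to a hard bit. A minimal sketch, assuming the noise level and names (the cyber-defence adaptation in the paper may differ in detail):

```python
import math
import random

def dru(message, training, noise_std=0.5, rng=None):
    """Discretise/Regularise Unit in the style of DIAL: noisy sigmoid
    during training (differentiable channel), hard threshold during
    execution (discrete channel)."""
    rng = rng or random.Random(0)
    if training:
        # Noise forces the sender to push messages away from the
        # decision boundary, so discretisation at test time is safe.
        return 1.0 / (1.0 + math.exp(-(message + rng.gauss(0.0, noise_std))))
    return 1.0 if message > 0 else 0.0
```

During execution each defender thus emits one cheap bit per step, which is what makes learning "minimal-cost" messages feasible.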
Learning Power Control Protocol for In-Factory 6G Subnetworks
Uyoata, Uyoata E., Berardinelli, Gilberto, Adeogun, Ramoni
In-X Subnetworks are envisioned to meet the stringent demands of short-range communication in diverse 6G use cases. In the context of In-Factory scenarios, effective power control is critical to mitigating the impact of interference resulting from potentially high subnetwork density. Existing approaches to power control in this domain have predominantly emphasized the data plane, often overlooking the impact of signaling overhead. Furthermore, prior work has typically adopted a network-centric perspective, relying on the assumption of complete and up-to-date channel state information (CSI) being readily available at the central controller. This paper introduces a novel multi-agent reinforcement learning (MARL) framework designed to enable access points to autonomously learn both signaling and power control protocols in an In-Factory Subnetwork environment. By formulating the problem as a partially observable Markov decision process (POMDP) and leveraging multi-agent proximal policy optimization (MAPPO), the proposed approach achieves significant advantages. The simulation results demonstrate that the learning-based method reduces signaling overhead by a factor of 8 while maintaining a buffer flush rate that lags the ideal "Genie" approach by only 5%.
- Telecommunications (0.68)
- Information Technology (0.46)
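Power control in dense subnetworks revolves around the SINR each receiver sees, which is the natural quantity for a MARL reward in this setting. A minimal sketch of that computation; the gain-matrix layout and values are illustrative assumptions, not the paper's channel model:

```python
def sinr(powers, gains, idx, noise=1e-9):
    """SINR of subnetwork idx: desired received power divided by the sum
    of interference from all other transmitters plus thermal noise.
    gains[i][j] is the channel gain from transmitter j to receiver i."""
    desired = powers[idx] * gains[idx][idx]
    interference = sum(
        powers[j] * gains[idx][j] for j in range(len(powers)) if j != idx
    )
    return desired / (interference + noise)
```

Lowering an interferer's transmit power raises a neighbour's SINR, which is the coupling that makes the problem a multi-agent one: each access point's action changes every other subnetwork's reward.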
Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication
Li, Huao, Mahjoub, Hossein Nourkhiz, Chalaki, Behdad, Tadiparthi, Vaishnav, Lee, Kwonjoon, Moradi-Pari, Ehsan, Lewis, Charles Michael, Sycara, Katia P
Multi-Agent Reinforcement Learning (MARL) methods have shown promise in enabling agents to learn a shared communication protocol from scratch and accomplish challenging team tasks. However, the learned language is usually not interpretable to humans or other agents not co-trained together, limiting its applicability in ad-hoc teamwork scenarios. In this work, we propose a novel computational pipeline that aligns the communication space between MARL agents with an embedding space of human natural language by grounding agent communications on synthetic data generated by embodied Large Language Models (LLMs) in interactive teamwork scenarios. Our results demonstrate that introducing language grounding not only maintains task performance but also accelerates the emergence of communication. Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states. This work presents a significant step toward enabling effective communication and collaboration between artificial agents and humans in real-world teamwork settings.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- (2 more...)
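One way to realise the alignment the abstract describes is an auxiliary loss that pulls agent message vectors toward the language embeddings of matched human utterances, added to the usual MARL objective. A hedged sketch, assuming a mean-squared alignment term; the paper's actual grounding objective may differ:

```python
def grounding_loss(agent_msgs, language_embs):
    """Mean squared distance between agent message vectors and the
    embeddings of matched natural-language utterances. Minimising this
    alongside the task reward keeps the emergent protocol close to the
    human-interpretable embedding space."""
    total, count = 0.0, 0
    for m, e in zip(agent_msgs, language_embs):
        total += sum((a - b) ** 2 for a, b in zip(m, e))
        count += 1
    return total / count
```

When messages already sit on their language embeddings the loss is zero, so the term only constrains the protocol rather than dictating the task policy.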
Leveraging Large Language Model for Heterogeneous Ad Hoc Teamwork Collaboration
Liu, Xinzhu, Li, Peiyan, Yang, Wenju, Guo, Di, Liu, Huaping
Abstract--Compared with the widely investigated homogeneous multi-robot collaboration, heterogeneous robots with different capabilities can provide more efficient and flexible collaboration for more complex tasks. In this paper, we consider a more challenging heterogeneous ad hoc teamwork collaboration problem in which an ad hoc robot joins an existing heterogeneous team for a shared goal. Specifically, the ad hoc robot collaborates with unknown teammates without prior coordination, and it is expected to generate an appropriate cooperation policy to improve the efficiency of the whole team. To solve this challenging problem, we leverage the remarkable potential of the large language model (LLM) to establish a decentralized heterogeneous ad hoc teamwork collaboration framework that focuses on generating a reasonable policy for an ad hoc robot to collaborate with the original heterogeneous teammates. A training-free hierarchical dynamic planner is developed using the LLM together with the newly proposed Interactive Reflection of Thoughts (IRoT) method for the ad hoc agent to adapt to different teams. Then, the new team collaborates and finally finishes the task.
Imagine that after a natural disaster such as an earthquake or hurricane, a team of robots is dispatched for the rescue task. Since the situation of a disaster site is complex, robots of different capabilities may be required for the rescue. These robots are likely to be brought from different places and thus arrive at the site at different times. The coming robot doesn't have any prior information on the existing teammates, and it is expected to collaborate efficiently and robustly with previously unknown teammates for the same goal. This describes a typical heterogeneous ad hoc teamwork, and the newly coming robot is called an ad hoc robot. The heterogeneous ad hoc teamwork collaboration is demonstrated in Figure 1, where heterogeneous robots of different capabilities can compose any team, and the original heterogeneous team collaborates to execute a task. An ad hoc robot could join this team at any time from any location.
During the past years, the multi-robot collaboration task has been widely investigated, and a number of multi-agent embodied tasks have been proposed in which multiple agents learn proper strategies to collaborate efficiently [17, 18, 22, 23, 42, 44, 45, 52] and solve complex embodied tasks [27, 35]. All these works only consider homogeneous agents with the same capabilities. However, in real-world applications, the robots may be faced with more complicated situations, and it is necessary to leverage robots with different capabilities to accomplish the task better [14, 19, 36, 37, 41]. Meanwhile, ad hoc teamwork collaboration is an important problem in heterogeneous multi-robot collaboration, which has been rarely addressed.
Beijing University of Posts and Telecommunications, Beijing, China.
- Asia > China > Beijing > Beijing (0.44)
- North America > United States > New York > Richmond County > New York City (0.04)
- North America > United States > New York > Queens County > New York City (0.04)
- (3 more...)
Intent Profiling and Translation Through Emergent Communication
Mostafa, Salwa, Elbamby, Mohammed S., Abdel-Aziz, Mohamed K., Bennis, Mehdi
To effectively express and satisfy network application requirements, intent-based network management has emerged as a promising solution. In intent-based methods, users and applications express their intent in a high-level abstract language to the network. Although this abstraction simplifies network operation, it induces many challenges to efficiently express applications' intents and map them to different network capabilities. Therefore, in this work, we propose an AI-based framework for intent profiling and translation. We consider a scenario where applications interacting with the network express their needs for network services in their domain language. The machine-to-machine communication (i.e., between applications and the network) is complex since it requires networks to learn how to understand the domain languages of each application, which is neither practical nor scalable. Instead, a framework based on emergent communication is proposed for intent profiling, in which applications express their abstract quality-of-experience (QoE) intents to the network through emergent communication messages. Subsequently, the network learns how to interpret these communication messages and map them to network capabilities (i.e., slices) to guarantee the requested Quality-of-Service (QoS). Simulation results show that the proposed method outperforms self-learning slicing and other baselines, and achieves a performance close to the perfect knowledge baseline.
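The intent-profiling loop described above (an application learning to emit messages, and the network learning to decode them into slices) can be caricatured as a Lewis signalling game. A toy tabular sketch under loud assumptions: epsilon-greedy bandit updates stand in for the paper's RL machinery, the required slice is taken to equal the intent index, and every name is illustrative:

```python
import random

def train_intent_channel(n_intents=3, n_msgs=3, n_slices=3, steps=3000, seed=0):
    """Tiny signalling game: a sender (application) learns which discrete
    message to emit for each QoE intent, and a receiver (network) learns
    which slice each message denotes. Both get reward 1 when the chosen
    slice serves the intent. Returns the greedy sender and receiver maps."""
    rng = random.Random(seed)
    q_send = [[0.0] * n_msgs for _ in range(n_intents)]
    q_recv = [[0.0] * n_slices for _ in range(n_msgs)]
    for t in range(steps):
        eps = max(0.05, 1.0 - t / (steps / 2))  # decaying exploration
        intent = rng.randrange(n_intents)
        msg = (rng.randrange(n_msgs) if rng.random() < eps
               else max(range(n_msgs), key=lambda a: q_send[intent][a]))
        slc = (rng.randrange(n_slices) if rng.random() < eps
               else max(range(n_slices), key=lambda a: q_recv[msg][a]))
        reward = 1.0 if slc == intent else 0.0  # identity mapping, toy only
        q_send[intent][msg] += 0.1 * (reward - q_send[intent][msg])
        q_recv[msg][slc] += 0.1 * (reward - q_recv[msg][slc])
    policy = {i: max(range(n_msgs), key=lambda a: q_send[i][a]) for i in range(n_intents)}
    decode = {m: max(range(n_slices), key=lambda a: q_recv[m][a]) for m in range(n_msgs)}
    return policy, decode
```

The point of the sketch is the shape of the problem, not its solution: neither side is told the meaning of any message, yet a shared protocol emerges from the joint reward, which is the essence of the emergent-communication framing.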